TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
We present TriviaQA, a challenging reading comprehension dataset containing
over 650K question-answer-evidence triples. TriviaQA includes 95K
question-answer pairs authored by trivia enthusiasts and independently gathered
evidence documents, six per question on average, that provide high quality
distant supervision for answering the questions. We show that, in comparison to
other recently introduced large-scale datasets, TriviaQA (1) has relatively
complex, compositional questions, (2) has considerable syntactic and lexical
variability between questions and corresponding answer-evidence sentences, and
(3) requires more cross sentence reasoning to find answers. We also present two
baseline algorithms: a feature-based classifier and a state-of-the-art neural
network that performs well on SQuAD reading comprehension. Neither approach
comes close to human performance (23% and 40% vs. 80%), suggesting that
TriviaQA is a challenging testbed that is worth significant future study. Data
and code available at http://nlp.cs.washington.edu/triviaqa/
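The distant-supervision setup described above - independently gathered evidence documents that merely contain the answer string - can be sketched as follows (the helper and the data layout are illustrative assumptions, not the released TriviaQA format):

```python
def find_answer_spans(evidence: str, answer_aliases: list[str]) -> list[tuple[int, int]]:
    """Return (start, end) character offsets where any answer alias
    occurs in the evidence document (case-insensitive)."""
    text = evidence.lower()
    spans = []
    for alias in answer_aliases:
        needle = alias.lower()
        start = text.find(needle)
        while start != -1:
            spans.append((start, start + len(needle)))
            start = text.find(needle, start + 1)
    return spans

# A question-answer-evidence triple: because the evidence was gathered
# independently of the question, a string match is only a noisy
# (distant) label, not a guaranteed answer occurrence.
evidence = "Mount Everest, at 8,849 metres, is Earth's highest mountain."
spans = find_answer_spans(evidence, ["Mount Everest", "Everest"])
```

A reading-comprehension model trained on such spans must tolerate the label noise, which is part of what makes the dataset a challenging testbed.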
Learning to map sentences to logical form
Thesis (Ph. D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 105-111).

One of the classical goals of research in artificial intelligence is to construct systems that automatically recover the meaning of natural language text. Machine learning methods hold significant potential for addressing many of the challenges involved with these systems. This thesis presents new techniques for learning to map sentences to logical form: lambda-calculus representations of their meanings.

We first describe an approach to the context-independent learning problem, where sentences are analyzed in isolation. We describe a learning algorithm that takes as input a training set of sentences labeled with expressions in the lambda calculus. The algorithm induces a Combinatory Categorial Grammar (CCG) for the problem, along with a log-linear model that represents a distribution over syntactic and semantic analyses conditioned on the input sentence.

Next, we present an extension that addresses challenges that arise when learning to analyze spontaneous, unedited natural language input, as is commonly seen in natural language interface applications. A key idea is to introduce non-standard CCG combinators that relax certain parts of the grammar - for example, allowing flexible word order or insertion of lexical items - with learned costs. We also present a new, online algorithm for inducing a weighted CCG.

Finally, we describe how to extend this learning approach to the context-dependent analysis setting, where the meaning of a sentence can depend on the context in which it appears. The training examples are sequences of sentences annotated with lambda-calculus meaning representations. We develop an algorithm that maintains explicit, lambda-calculus representations of discourse entities and uses a context-dependent analysis pipeline to recover logical forms. The method uses a hidden-variable variant of the perceptron algorithm to learn a linear model used to select the best analysis. Experiments demonstrate that the learning techniques we develop induce accurate models for semantic analysis while requiring less data annotation effort than previous approaches.

By Luke S. Zettlemoyer.
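The sentence-to-logical-form mapping the thesis targets can be illustrated with a toy lambda-calculus representation (the `Lambda`, `And`, and `Pred` classes and the example predicates are illustrative assumptions, not the thesis implementation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Pred:
    name: str
    args: tuple  # tuple of sub-expressions (Var or constants)

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Lambda:
    var: Var
    body: object

def show(e) -> str:
    """Render a lambda-calculus expression as a readable string."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Pred):
        return f"{e.name}({', '.join(show(a) for a in e.args)})"
    if isinstance(e, And):
        return f"({show(e.left)} ∧ {show(e.right)})"
    if isinstance(e, Lambda):
        return f"λ{e.var.name}.{show(e.body)}"
    raise TypeError(e)

# "states bordering Texas" ↦ λx.(state(x) ∧ borders(x, texas))
x = Var("x")
form = Lambda(x, And(Pred("state", (x,)), Pred("borders", (x, Var("texas")))))
```

A learned CCG parser of the kind described above would map the sentence to such an expression compositionally, scoring candidate analyses with the log-linear model.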
Learning probabilistic relational planning rules
To learn to behave in highly complex domains, agents must represent and learn compact models of the world dynamics. In this paper, we present an algorithm for learning probabilistic STRIPS-like planning operators from examples. We demonstrate the effective learning of rule-based operators for a wide range of traditional planning domains.
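A probabilistic STRIPS-like operator of the kind learned here can be sketched as a small data structure (the rule, literals, and probabilities below are illustrative assumptions, not learned output):

```python
import random
from dataclasses import dataclass

@dataclass
class Outcome:
    prob: float
    add: frozenset      # literals added to the state
    delete: frozenset   # literals removed from the state

@dataclass
class Rule:
    action: str
    preconditions: frozenset
    outcomes: list      # Outcome probabilities sum to 1

    def applicable(self, state: frozenset) -> bool:
        return self.preconditions <= state

    def apply(self, state: frozenset, rng: random.Random) -> frozenset:
        """Sample one outcome and apply its add/delete effects."""
        assert self.applicable(state)
        r, acc = rng.random(), 0.0
        for o in self.outcomes:
            acc += o.prob
            if r <= acc:
                return (state - o.delete) | o.add
        return state  # guard against floating-point rounding

# A pickup action that succeeds 90% of the time (illustrative numbers).
pickup = Rule(
    action="pickup(b)",
    preconditions=frozenset({"on_table(b)", "hand_empty"}),
    outcomes=[
        Outcome(0.9, frozenset({"holding(b)"}), frozenset({"on_table(b)", "hand_empty"})),
        Outcome(0.1, frozenset(), frozenset()),  # no change: the grasp slips
    ],
)
state = frozenset({"on_table(b)", "hand_empty"})
```

Learning such rules from examples amounts to inducing the preconditions, the outcome sets, and their probabilities from observed state transitions.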
SpanBERT: Improving Pre-training by Representing and Predicting Spans
We present SpanBERT, a pre-training method that is designed to better
represent and predict spans of text. Our approach extends BERT by (1) masking
contiguous random spans, rather than random tokens, and (2) training the span
boundary representations to predict the entire content of the masked span,
without relying on the individual token representations within it. SpanBERT
consistently outperforms BERT and our better-tuned baselines, with substantial
gains on span selection tasks such as question answering and coreference
resolution. In particular, with the same training data and model size as
BERT-large, our single model obtains 94.6% and 88.7% F1 on SQuAD 1.1 and 2.0,
respectively. We also achieve a new state of the art on the OntoNotes
coreference resolution task (79.6% F1), strong performance on the TACRED
relation extraction benchmark, and even show gains on GLUE.
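The contiguous span masking in (1) can be sketched roughly as follows; the clipped geometric span-length distribution and the ~15% masking budget follow the paper's high-level description, but this helper is a simplified illustration, not SpanBERT's implementation:

```python
import random

def sample_span_length(rng, p=0.2, max_len=10):
    """Draw a span length from a clipped geometric distribution."""
    length = 1
    while rng.random() >= p and length < max_len:
        length += 1
    return length

def mask_spans(tokens, mask_ratio=0.15, seed=0, mask_token="[MASK]"):
    """Replace contiguous spans with mask tokens until roughly
    mask_ratio of the sequence is masked (simplified sketch)."""
    rng = random.Random(seed)
    out = list(tokens)
    masked = set()
    while len(masked) < mask_ratio * len(tokens):
        length = sample_span_length(rng)
        start = rng.randrange(0, max(1, len(tokens) - length + 1))
        for i in range(start, min(start + length, len(tokens))):
            masked.add(i)
            out[i] = mask_token
    return out, sorted(masked)
```

The span boundary objective then asks the representations of the tokens just outside each masked span to predict every token inside it, which is what pushes span-level information into the boundaries.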
Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands
To understand diverse natural language commands, virtual assistants today are
trained with numerous labor-intensive, manually annotated sentences. This paper
presents a methodology and the Genie toolkit that can handle new compound
commands with significantly less manual effort. We advocate formalizing the
capability of virtual assistants with a Virtual Assistant Programming Language
(VAPL) and using a neural semantic parser to translate natural language into
VAPL code. Genie needs only a small realistic set of input sentences for
validating the neural model. Developers write templates to synthesize data;
Genie uses crowdsourced paraphrases and data augmentation, along with the
synthesized data, to train a semantic parser. We also propose design principles
that make VAPL languages amenable to natural language translation. We apply
these principles to revise ThingTalk, the language used by the Almond virtual
assistant. We use Genie to build the first semantic parser that can support
compound virtual assistant commands with unquoted free-form parameters. Genie
achieves a 62% accuracy on realistic user inputs. We demonstrate Genie's
generality by showing a 19% and 31% improvement over the previous state of the
art on a music skill, aggregate functions, and access control.
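Template-based data synthesis of the kind Genie performs can be sketched as follows (the template format and the toy commands are illustrative assumptions, not ThingTalk/VAPL syntax):

```python
from itertools import product

# Each template pairs a natural-language pattern with a code pattern;
# placeholders are expanded over small value lists (illustrative only).
templates = [
    ("play {genre} music", "@spotify.play(genre={genre!r})"),
    ("turn {state} the lights", "@light.set_power(power={state!r})"),
]
values = {
    "genre": ["jazz", "rock"],
    "state": ["on", "off"],
}

def synthesize(templates, values):
    """Expand every placeholder combination into (sentence, code) pairs."""
    data = []
    for nl, code in templates:
        slots = [k for k in values if "{" + k + "}" in nl]
        for combo in product(*(values[s] for s in slots)):
            binding = dict(zip(slots, combo))
            data.append((nl.format(**binding), code.format(**binding)))
    return data

pairs = synthesize(templates, values)
```

The synthesized pairs, combined with crowdsourced paraphrases and augmentation, form the training data for the neural semantic parser, so only a small set of real input sentences is needed for validation.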